A Simple CW-SSIM Kernel-based Nearest Neighbor Method for Handwritten Digit Classification

Authors

  • Jiheng Wang
  • Guangzhe Fan
  • Zhou Wang
Abstract

We propose a simple kernel-based nearest neighbor approach for handwritten digit classification. The "distance" here is actually a kernel that defines the similarity between two images. We carefully study the effects of different numbers of neighbors and weighting schemes and report the results. With only a few nearest neighbors (that is, the most similar images) voting, the test set error rate on the MNIST database reaches about 1.5%-2.0%, which is very close to that of many advanced models.

Introduction

Due to the high dimensionality of digital images, image classification algorithms typically require a feature extraction process (such as corner detection) or an appearance-based dimension reduction stage (such as principal component analysis) before statistical learning and classification algorithms are applied. Meanwhile, there has been some interesting recent progress on defining similarity metrics between two images in their original 2D functional form. These include the structural similarity (SSIM) index [1] and its extension, the complex wavelet SSIM (CW-SSIM) index [2,3]. Conceptually, these similarity metrics have the potential to be used in image classification problems, but there has not been sufficient study of how this should be done in real-world scenarios.

Image similarity indices play a crucial role in the development, assessment and optimization of a large number of image processing and pattern recognition systems. An image can be viewed as a 2D function of intensity. Perhaps the simplest way to compare two images is to compute the mean squared error (MSE) between these two 2D functions. Unfortunately, such a point-wise similarity measure does not take into account the correlation between neighboring image pixels and has been shown to be problematic in many ways [4].
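The point-wise nature of the mean squared error can be illustrated with a minimal sketch. The function name `mse` and the synthetic 28×28 arrays below are illustrative assumptions, not material from the paper. The sketch applies the same random pixel permutation to both images: the spatial structure is destroyed, yet the MSE does not change, because the measure never looks at how neighboring pixels relate to each other.

```python
import numpy as np

def mse(x, y):
    """Mean squared error between two equally sized images."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.mean((x - y) ** 2))

# Synthetic stand-ins for two images (assumption: not real MNIST data).
rng = np.random.default_rng(0)
x = rng.random((28, 28))
y = x + 0.1 * rng.standard_normal((28, 28))

# Scramble the pixel positions of both images with the same permutation.
perm = rng.permutation(x.size)
x_scrambled = x.ravel()[perm].reshape(x.shape)
y_scrambled = y.ravel()[perm].reshape(y.shape)

mse_original = mse(x, y)
mse_scrambled = mse(x_scrambled, y_scrambled)

# Both values agree up to floating-point rounding.
print(mse_original, mse_scrambled)
```
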
Recently, a substantially different approach called the SSIM index [1] was proposed, where the structural information of an image is defined as those attributes that represent the structures of the objects in the visual scene, apart from the mean intensity and contrast. Thus, the SSIM index separates the comparison of local structural patterns from the comparison of local mean intensity and contrast. The SSIM index has shown somewhat surprising success in predicting perceptual image quality when compared with more sophisticated methods based on psychological models of the human visual system [4]. A common drawback of both the MSE and SSIM metrics is their high sensitivity to small geometric distortions such as translation, rotation and scaling. The CW-SSIM measure overcomes this problem by extending SSIM to the complex wavelet transform domain [2,3]. The key idea behind CW-SSIM is that small geometric image distortions lead to consistent phase changes in local wavelet coefficients, and that a consistent phase shift of the coefficients does not change the structural content of the image. The potential of CW-SSIM has been demonstrated in a series of applications, including image quality assessment [2], digit recognition [2], line-drawing comparison [3], segmentation comparison [3], range-based face recognition [5] and palmprint recognition [6].

The well-known MNIST database of handwritten digits is composed of 60,000 training and 10,000 test examples, collected from Census Bureau employees and high school students. The original images have a normalized size of 28×28 and contain gray levels for the purpose of anti-aliasing. In this paper we propose a series of simple and fast kernel-based classification algorithms built on the CW-SSIM index, which appear to be effective and reliable tools for classifying the MNIST handwritten digits.
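The key idea behind CW-SSIM described above, that a consistent phase shift leaves the index unchanged, can be checked numerically with a small sketch. The text here does not state the formula, so the sketch assumes the local form commonly given for CW-SSIM, 2|Σ c_x c_y*| + K divided by Σ|c_x|² + Σ|c_y|² + K. The function name `cw_ssim_local`, the value of `K`, and the random complex numbers standing in for real complex wavelet coefficients are all illustrative assumptions.

```python
import numpy as np

def cw_ssim_local(cx, cy, K=0.01):
    """Assumed local CW-SSIM form for two sets of complex coefficients
    taken from the same window of the same subband."""
    cx = np.asarray(cx, dtype=complex).ravel()
    cy = np.asarray(cy, dtype=complex).ravel()
    numerator = 2.0 * np.abs(np.sum(cx * np.conj(cy))) + K
    denominator = np.sum(np.abs(cx) ** 2) + np.sum(np.abs(cy) ** 2) + K
    return float(numerator / denominator)

# Random complex values stand in for wavelet coefficients (assumption).
rng = np.random.default_rng(0)
cx = rng.standard_normal(49) + 1j * rng.standard_normal(49)

# Every coefficient rotated by the same phase: models a small translation.
cy_consistent = cx * np.exp(1j * 0.7)

# Every coefficient rotated by a different phase: structure is altered.
cy_random = cx * np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, cx.size))

s_consistent = cw_ssim_local(cx, cy_consistent)
s_random = cw_ssim_local(cx, cy_random)

# The consistent shift keeps the index at its maximum; random phases lower it.
print(s_consistent, s_random)
```

A full implementation would first decompose each image with a complex wavelet transform and average this local measure over windows and subbands; that step is omitted here.
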
Although no feature extraction or dimension reduction process is involved, we obtain quite competitive results with only a simple k-NN model. Given that the CW-SSIM index provides a powerful similarity measure between two misaligned images and there are sufficient training examples in the MNIST database, we are able to effectively classify test samples using only the most similar images.
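A kernel-based nearest neighbor vote of the kind described above might be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `kernel_knn_predict`, the similarity-proportional weighting, and the toy similarity values are assumptions, since the text mentions weighting schemes without specifying them. The input is a vector of precomputed similarities (for example CW-SSIM values) between one test image and every training image.

```python
import numpy as np

def kernel_knn_predict(similarities, train_labels, k=3, weighted=True, n_classes=10):
    """Predict the class of one test image from its similarities to the
    training images; larger similarity means a closer neighbor."""
    similarities = np.asarray(similarities, dtype=float)
    train_labels = np.asarray(train_labels, dtype=int)
    # Indices of the k most similar training images, most similar first.
    nearest = np.argsort(similarities)[::-1][:k]
    # Hypothetical weighting scheme: each vote counts in proportion to
    # its similarity; the unweighted variant gives every neighbor one vote.
    if weighted:
        weights = similarities[nearest]
    else:
        weights = np.ones(len(nearest))
    votes = np.bincount(train_labels[nearest], weights=weights, minlength=n_classes)
    return int(np.argmax(votes))

# Toy similarities of one test image to five training images (made-up values).
sims = [0.95, 0.90, 0.40, 0.88, 0.91]
labels = [3, 8, 1, 8, 8]

pred_k1 = kernel_knn_predict(sims, labels, k=1)
pred_k3 = kernel_knn_predict(sims, labels, k=3)
print(pred_k1, pred_k3)
```

In this toy case the single most similar image has label 3, while the three most similar images give label 8 the larger total weight, which shows how the number of neighbors can change the decision.
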


Similar resources

Complex-Wavelet Structural Similarity Based Image Classification

Complex wavelet structural similarity (CW-SSIM) index has been recognized as a novel image similarity measure of broad potential applications due to its robustness to small geometric distortions such as translation, scaling and rotation of images. Nevertheless, how to make the best use of it in image classification problems has not been deeply investigated. In this study, we introduce a series ...


Image classification based on complex wavelet structural similarity

Complex wavelet structural similarity (CW-SSIM) index has been recognized as a novel image similarity measure of broad potential applications due to its robustness to small geometric distortions such as translation, scaling and rotation of images. Nevertheless, how to make the best use of it in image classification problems has not been deeply investigated. In this paper, we introduce a series ...


CW-SSIM Kernel based Random Forest for Image Classification

Complex wavelet structural similarity (CW-SSIM) index has been proposed as a powerful image similarity metric that is robust to translation, scaling and rotation of images, but how to employ it in image classification applications has not been deeply investigated. In this paper, we incorporate CW-SSIM as a kernel function into a random forest learning algorithm. This leads to a novel image clas...


Recognition of Handwritten Arabic (Indian) Numerals using Radon-Fourier-based Features

This paper describes a technique for the recognition of off-line handwritten Arabic (Indian) numerals using Radon-Fourier-based features. A two stage classification scheme is used. The Nearest Mean (NMC), K-Nearest Neighbor (K-NNC), and Hidden Markov Models (HMMC) Classifiers are used in the first stage and a Structural Classifier (SC) is used in the second stage. A database of 44 writers with ...


Pattern Classification Using Weighted Average Patterns of Categorical k-Nearest Neighbors

The recognition rate of the typical nonparametric method “k-nearest neighbor rule (kNN)” is degraded when the dimensionality of feature vectors is large. For reducing this difficulty, Mitani and Hamamoto have proposed a simple and strong classifier that outputs the class of a test sample by measuring the distance between the test sample and the average patterns, which are calculated using k-nea...




Publication date: 2010